vstinner
Member

@vstinner vstinner commented Oct 8, 2025

@vstinner vstinner marked this pull request as draft October 11, 2025 21:57
@vstinner
Member Author

I converted this PR to a draft for now, since the API seems to be misused by third-party projects. I also proposed PyDict_FromItems(), which is a different abstraction: #139963

@vstinner vstinner force-pushed the dict_presized branch 3 times, most recently from eb555c6 to 8bb9715 Compare October 12, 2025 12:40
@vstinner
Member Author

I rewrote the PR to add a unicode_keys parameter: PyObject* PyDict_NewPresized(Py_ssize_t size, int unicode_keys).

@methane
Member

methane commented Oct 13, 2025

There are two news entries.

@vstinner
Member Author

vstinner commented Oct 13, 2025

Benchmark on PyDict_New() vs PyDict_NewPresized() with Unicode keys:

| Benchmark      | new     | presized              |
|----------------|---------|-----------------------|
| dict-10        | 2.69 us | 2.62 us: 1.03x faster |
| dict-100       | 29.6 us | 27.5 us: 1.08x faster |
| dict-1,000     | 301 us  | 283 us: 1.06x faster  |
| dict-10,000    | 3.50 ms | 3.18 ms: 1.10x faster |
| Geometric mean | (ref)   | 1.05x faster          |

Benchmark hidden because not significant (1): dict-1

Code:

diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index 4e73be20e1b..a1eaed01178 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -2562,6 +2562,77 @@ toggle_reftrace_printer(PyObject *ob, PyObject *arg)
     Py_RETURN_NONE;
 }
 
+
+static PyObject *
+bench_dict_new(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t size, loops;
+    if (!PyArg_ParseTuple(args, "nn", &size, &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t loop=0; loop < loops; loop++) {
+        PyObject *d = PyDict_New();
+        if (d == NULL) {
+            return NULL;
+        }
+
+        for (Py_ssize_t i=0; i < size; i++) {
+            PyObject *key = PyUnicode_FromFormat("%zi", i);
+            assert(key != NULL);
+
+            PyObject *value = PyLong_FromLong(i);
+            assert(value != NULL);
+
+            assert(PyDict_SetItem(d, key, value) == 0);
+        }
+
+        assert(PyDict_Size(d) == size);
+        Py_DECREF(d);
+    }
+    PyTime_PerfCounterRaw(&t2);
+
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+
+static PyObject *
+bench_dict_presized(PyObject *ob, PyObject *args)
+{
+    Py_ssize_t size, loops;
+    if (!PyArg_ParseTuple(args, "nn", &size, &loops)) {
+        return NULL;
+    }
+
+    PyTime_t t1, t2;
+    PyTime_PerfCounterRaw(&t1);
+    for (Py_ssize_t loop=0; loop < loops; loop++) {
+        PyObject *d = PyDict_NewPresized(size, 1);
+        if (d == NULL) {
+            return NULL;
+        }
+
+        for (Py_ssize_t i=0; i < size; i++) {
+            PyObject *key = PyUnicode_FromFormat("%zi", i);
+            assert(key != NULL);
+
+            PyObject *value = PyLong_FromLong(i);
+            assert(value != NULL);
+
+            assert(PyDict_SetItem(d, key, value) == 0);
+        }
+
+        assert(PyDict_Size(d) == size);
+        Py_DECREF(d);
+    }
+    PyTime_PerfCounterRaw(&t2);
+
+    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
+}
+
+
 static PyMethodDef TestMethods[] = {
     {"set_errno",               set_errno,                       METH_VARARGS},
     {"test_config",             test_config,                     METH_NOARGS},
@@ -2656,6 +2727,8 @@ static PyMethodDef TestMethods[] = {
     {"test_atexit", test_atexit, METH_NOARGS},
     {"code_offset_to_line", _PyCFunction_CAST(code_offset_to_line), METH_FASTCALL},
     {"toggle_reftrace_printer", toggle_reftrace_printer, METH_O},
+    {"bench_dict_new", bench_dict_new, METH_VARARGS},
+    {"bench_dict_presized", bench_dict_presized, METH_VARARGS},
     {NULL, NULL} /* sentinel */
 };
 

bench_new.py:

import pyperf
import functools
import _testcapi
runner = pyperf.Runner()
for size in (1, 10, 100, 1_000, 10_000):
    func = functools.partial(_testcapi.bench_dict_new, size)
    runner.bench_time_func(f'dict-{size:,}', func)

bench_presized.py:

import pyperf
import functools
import _testcapi
runner = pyperf.Runner()
for size in (1, 10, 100, 1_000, 10_000):
    func = functools.partial(_testcapi.bench_dict_presized, size)
    runner.bench_time_func(f'dict-{size:,}', func)
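
Both scripts rely on the time-function contract of `runner.bench_time_func()`: the callable receives the number of loops and returns the total elapsed time in seconds, which is why `functools.partial` binds only `size`. As a minimal sketch of that contract, here is a pure-Python stand-in. The name `bench_dict_new_py` is hypothetical, and the function only mirrors the shape of the C helpers above; it does not measure the C API.

```python
import time


def bench_dict_new_py(size: int, loops: int) -> float:
    """Hypothetical pure-Python analogue of bench_dict_new(size, loops).

    Runs `loops` iterations of "build a dict with `size` str keys" and
    returns the total elapsed time in seconds, as a time function should.
    """
    t1 = time.perf_counter()
    for _ in range(loops):
        d = {}
        for i in range(size):
            d[str(i)] = i
        assert len(d) == size
    t2 = time.perf_counter()
    return t2 - t1


# Manual call: the benchmark runner would choose `loops` itself.
loops = 10
elapsed = bench_dict_new_py(100, loops)
print(f"{elapsed / loops * 1e6:.2f} us per loop")
```

With pyperf installed, the same function could be registered the way the scripts above do it, by passing `functools.partial(bench_dict_new_py, size)` to `runner.bench_time_func()`.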

@vstinner
Member Author

I opened capi-workgroup/decisions#80 with the C API Working Group for this API.

@vstinner
Member Author

Benchmark on PyDict_New() vs PyDict_NewPresized() with integer keys:

| Benchmark      | new     | presized              |
|----------------|---------|-----------------------|
| dict-1         | 294 ns  | 301 ns: 1.02x slower  |
| dict-10        | 2.61 us | 2.51 us: 1.04x faster |
| dict-100       | 26.1 us | 24.8 us: 1.05x faster |
| dict-1,000     | 260 us  | 250 us: 1.04x faster  |
| dict-10,000    | 3.07 ms | 2.78 ms: 1.10x faster |
| Geometric mean | (ref)   | 1.04x faster          |
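
For readers who want to check how the summary row relates to the per-size rows, the sketch below recomputes the geometric mean from the rounded timings in the integer-key table. It assumes the summary is the geometric mean of the per-benchmark `new / presized` ratios; the variable names are illustrative, and the inputs are the rounded values shown above rather than the raw measurements.

```python
from statistics import geometric_mean

# (new, presized) timings in nanoseconds, copied from the integer-key table.
timings_ns = {
    "dict-1": (294, 301),
    "dict-10": (2_610, 2_510),
    "dict-100": (26_100, 24_800),
    "dict-1,000": (260_000, 250_000),
    "dict-10,000": (3_070_000, 2_780_000),
}

# A ratio above 1.0 means the presized variant is faster.
speedups = {name: new / presized for name, (new, presized) in timings_ns.items()}
overall = geometric_mean(list(speedups.values()))

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
print(f"Geometric mean: {overall:.2f}x")
```

The result rounds to the 1.04x reported in the table, with the dict-1 slowdown pulling the mean down slightly.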

@davidhewitt
Contributor

This seems useful to me for PyO3 👍

I am unsure how reliably we will be able to use the unicode_keys hint. My feeling is that it might be the case that in cases where we're confident about the key types we would have been able to use the proposed PyDict_FromItems.

@vstinner
Member Author

> I am unsure how reliably we will be able to use the unicode_keys hint. My feeling is that it might be the case that in cases where we're confident about the key types we would have been able to use the proposed PyDict_FromItems.

Correct.

If you know your input data, you can set the unicode_keys hint in advance, before consuming the iterator. You can use PyDict_NewPresized() in this case.

If you don't know your input data, you might need to consume the iterator and store keys and values in a temporary array, and then call PyDict_FromItems() which computes the unicode_keys hint for you.
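
The second strategy can be pictured with a short Python sketch. It is only an illustration of the idea, not the implementation of PyDict_FromItems(): the helper name `collect_and_hint` is hypothetical, and the rule "the hint is true only when every key is exactly a `str`" is an assumption about how such a hint would be computed.

```python
from typing import Any, Iterable


def collect_and_hint(
    pairs: Iterable[tuple[Any, Any]],
) -> tuple[list[tuple[Any, Any]], bool]:
    """Hypothetical helper: buffer an iterator and derive a unicode_keys hint.

    Assumption: the hint is True only if every key is exactly a str
    (subclasses of str do not count).
    """
    items = list(pairs)  # the temporary array mentioned above
    unicode_keys = all(type(key) is str for key, _ in items)
    return items, unicode_keys


items, hint = collect_and_hint(iter([("a", 1), ("b", 2)]))
print(len(items), hint)

items, hint = collect_and_hint(iter([("a", 1), (2, "b")]))
print(len(items), hint)
```

The cost of this approach is the temporary list itself: every key and value is held in the buffer before the dict is built.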

@davidhewitt
Contributor

I think this seems the wrong way around for me as a user. If I don't know my input data, I'd rather not collect it into a temporary array: it could be a large dataset, which would mean a big temporary allocation.

If I know the input data, I was thinking I would even be able to allocate the items in stack memory before calling PyDict_FromItems.

@davidhewitt
Contributor

Or are you saying that it is more efficient to use PyDict_NewPresized and repeated calls to PyDict_SetItem than to use PyDict_FromItems?

@vstinner
Member Author

> Or are you saying that it is more efficient to use PyDict_NewPresized and repeated calls to PyDict_SetItem than to use PyDict_FromItems?

Oh, I didn't know which function is faster, so I ran benchmarks: #139963 (comment). PyDict_FromItems() is faster than PyDict_NewPresized() + PyDict_SetItem().

